Conversation
Pull request overview
Adds LFM2 tool-call parsing support to OVMS LLM I/O processing, wiring it into the existing OutputParser tool-parser selection and providing a dedicated LFM2 output parser test suite.
Changes:
- Introduced `Lfm2ToolParser` (unary + streaming parsing) under `src/llm/io_processing/lfm2/`.
- Registered the new parser in `OutputParser` (parser name `"lfm2"`) and Bazel build targets.
- Added comprehensive LFM2 output parser tests (including streaming scenarios and malformed inputs).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| `src/test/llm/output_parsers/lfm2_output_parser_test.cpp` | New gtest coverage for LFM2 tool-call parsing (unary + streaming). |
| `src/llm/io_processing/output_parser.cpp` | Registers the `"lfm2"` tool parser and includes its header. |
| `src/llm/io_processing/lfm2/tool_parser.hpp` | Declares the `Lfm2ToolParser` state machine and parsing helpers. |
| `src/llm/io_processing/lfm2/tool_parser.cpp` | Implements LFM2 unary + streaming parsing and argument normalization. |
| `src/llm/BUILD` | Adds a Bazel library target for the LFM2 parser and links it into `output_parsers`. |
```cpp
std::optional<rapidjson::Document> Lfm2ToolParser::parseChunk(const std::string& chunk, ov::genai::GenerationFinishReason finishReason) {
    if (chunk.empty()) {
        return std::nullopt;
    }

    this->streamingContent += chunk;

    if (parseNewContent()) {
        if (this->currentState == State::ToolCallParameters) {
            return BaseOutputParser::wrapFirstDelta(this->toolCall.name, toolCallIndex);
        }
        if (this->currentState == State::ToolCallEnded) {
            return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex);
        }
        if (this->currentState == State::Content) {
            auto content = this->streamingContent.substr(this->streamingPosition);
            this->streamingPosition += content.size();
            return wrapDeltaContent(content);
        }
        if (this->currentState == State::AfterToolCall) {
            this->currentState = State::Content;
        }
    }

    return std::nullopt;
}
```
Streaming: streamingContent is only appended to and never trimmed, while streamingPosition monotonically increases. For long responses this can grow without bound and increase substring/search costs over time. Consider erasing the processed prefix (or keeping only an unprocessed tail) once streamingPosition moves forward, to keep memory and time bounded.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
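A minimal stdlib-only sketch of the suggested fix (the `StreamBuffer` struct below mirrors the parser's `streamingContent` / `streamingPosition` fields but is a standalone illustration, not the actual class): once everything up to `streamingPosition` has been processed, the prefix can be erased and the position rebased to zero, keeping both memory use and search costs bounded.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: mirrors the parser's streamingContent /
// streamingPosition members, but standalone.
struct StreamBuffer {
    std::string streamingContent;
    std::size_t streamingPosition = 0;

    void append(const std::string& chunk) {
        streamingContent += chunk;
    }

    // Erase the already-processed prefix and rebase the position to 0,
    // so the buffer only ever holds the unprocessed tail.
    void trimProcessedPrefix() {
        if (streamingPosition > 0) {
            streamingContent.erase(0, streamingPosition);
            streamingPosition = 0;
        }
    }
};
```

Calling something like `trimProcessedPrefix()` at the end of each `parseChunk` invocation would keep the buffer proportional to one unconsumed tail rather than the whole response; any stored offsets into the buffer would need to be rebased at the same time.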
```cpp
const std::string Lfm2ToolParser::TOOL_ARGS_END_INDICATOR = ")";
const std::string Lfm2ToolParser::TOOL_SEPARATOR_STR = ", ";

void Lfm2ToolParser::writeArgumentOfAnyType(const rapidjson::Value& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) {
```
If that's a standalone function, could it be static?
Also, it sounds quite generic. Can we extract it to utils? Not sure if we already have such a file in io_processing...
```cpp
    }
    writer.EndObject();
} else {
    writer.String("");
```
Should we write anything here if we're logging an error?
Otherwise the key would be left dangling: `{key: bad_value, another_key: value}` would be serialized as `{"key", "another_key":"value"}`, which is invalid JSON.
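A stdlib-only sketch of the trade-off in this thread (`serializeArgs` is a hypothetical stand-in for the rapidjson writer path; `nullopt` models a value that failed to parse after its key was already emitted): writing an empty-string placeholder keeps the object well-formed, whereas dropping the value would corrupt the JSON.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the writer logic: nullopt represents a
// value that could not be parsed after its key was already written.
std::string serializeArgs(
    const std::vector<std::pair<std::string, std::optional<std::string>>>& args) {
    std::string out = "{";
    bool first = true;
    for (const auto& [key, value] : args) {
        if (!first) {
            out += ",";
        }
        first = false;
        out += "\"" + key + "\":";
        // Emit "" for a bad value; skipping it would yield output like
        // {"key", "another_key":"value"}, which is invalid JSON.
        out += "\"" + value.value_or("") + "\"";
    }
    out += "}";
    return out;
}
```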
```cpp
const char last = normalized.back();
if ((first == '{' && last == '}') || (first == '[' && last == ']')) {
    std::replace(normalized.begin(), normalized.end(), '\'', '"');
    SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized);
```
Is this always safe? What if we get an argument with a list of strings: `["hello", "it's me"]`?
You're right, I will think about how to change it.
I've changed it and covered it with the ParseToolCallWithStringArgumentsArrayWithStringsContainingQuotes and ParseToolCallWithStringArgumentsObjectWithStringsContainingQuotes tests.
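A minimal reproduction of the concern raised above (`naiveNormalize` is a hypothetical reduction of the original blanket replace, not the fixed code): an unconditional single-to-double quote substitution corrupts string values that contain apostrophes.

```cpp
#include <algorithm>
#include <string>

// Hypothetical reduction of the original normalization step: replace
// every single quote with a double quote, with no awareness of whether
// the quote delimits a string or sits inside one.
std::string naiveNormalize(std::string s) {
    std::replace(s.begin(), s.end(), '\'', '"');
    return s;
}
```

For `['hello', "it's me"]` this produces `["hello", "it"s me"]`, which no longer parses as JSON, hence the quote-aware handling now exercised by the two tests mentioned in the reply.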
```cpp
size_t pos = 0;
int main_guard = 0;

while (pos != std::string::npos && main_guard < MAX_TOOL_CALLS) {
```
Is this guard really needed?
This parser will be used with models from 2.6B to 24B. When the 2.6B model is used for multi-turn, it gets confused: it generates tool calls and doesn't follow its usual template. To avoid infinite loops I would leave the guard in.
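A stdlib-only sketch of the bounded scan being discussed (the tag strings are parameters here because their actual values are not shown in the diff, and the `MAX_TOOL_CALLS` value of 32 is an assumption, not the real constant): the guard caps the iteration count so a model stuck emitting tool-call tags cannot hang the parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed cap; the real MAX_TOOL_CALLS value is defined elsewhere.
constexpr int MAX_TOOL_CALLS = 32;

// Extract the payload between each start/end tag pair, giving up after
// MAX_TOOL_CALLS iterations even if more tags remain in the text.
std::vector<std::string> extractToolCalls(const std::string& text,
                                          const std::string& startTag,
                                          const std::string& endTag) {
    std::vector<std::string> calls;
    std::size_t pos = 0;
    int guard = 0;
    while (pos != std::string::npos && guard < MAX_TOOL_CALLS) {
        std::size_t start = text.find(startTag, pos);
        if (start == std::string::npos) {
            break;
        }
        std::size_t payload = start + startTag.size();
        std::size_t end = text.find(endTag, payload);
        if (end == std::string::npos) {
            break;
        }
        calls.push_back(text.substr(payload, end - payload));
        pos = end + endTag.size();
        ++guard;
    }
    return calls;
}
```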
```cpp
size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition);
size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition);
if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) {
    SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos);
```
So at that point we know that tools were generated? Won't it catch the end of a tool call even if the model did not output any tools, but, for example, just an array of numbers?
🛠 Summary
CVS-182500
Adds a tool parser for the LFM2 model.
🧪 Checklist